School Escorting Runtime Optimization #828
    os.path.join(data_dir, "create_bundle_attributes_inbound__input.pkl")
)
inbound_expected = pd.read_pickle(
    os.path.join(data_dir, "create_bundle_attributes_inbound__output.pkl")
Can we use parquet instead of pickle for test data? Pickle is known to have compatibility issues when the Python version or a dependency (such as pandas) changes.
As discussed during the meeting today, the parquet format does not preserve column dtypes the way pickle does. This matters for the school escorting model in particular because some columns contain arrays of values (for example, the escorting participants or the school destinations of the escort bundle). The parquet format does not store these arrays appropriately.
The long-term solution agreed upon was to push this parquet compatibility issue (which I think occurs in a couple of other places besides the school escort model) to the data model.
To make sure this would not become a problem immediately, the request was to test with pandas 2. I can confirm that this unit test passes with pandas v2.2.2, the latest version.
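For reference, here is a minimal sketch of the pickle behaviour described above. It assumes a toy table with hypothetical column names (`bundle_id`, `escortees`), not the actual school escorting fixtures, and it only demonstrates the pickle side of the comparison:

```python
import os
import tempfile

import numpy as np
import pandas as pd

# Hypothetical stand-in for a bundle table; the column names are
# illustrative and are not taken from the school escorting model.
bundles = pd.DataFrame(
    {
        "bundle_id": [1, 2],
        # Each cell holds an array of values, so the column has object dtype.
        "escortees": [np.array([11, 12]), np.array([13])],
    }
)

# Round-trip the table through a pickle file in a temporary directory.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "bundles.pkl")
    bundles.to_pickle(path)
    restored = pd.read_pickle(path)

# Record what kind of object each cell holds after the round trip.
cell_types = [type(value).__name__ for value in restored["escortees"]]

# Check that every array has the same contents as before.
same_values = all(
    np.array_equal(before, after)
    for before, after in zip(bundles["escortees"], restored["escortees"])
)
```

Within a single environment, the array-valued cells come back as the same objects that were stored, which is why the existing `.pkl` fixtures can be compared directly against the model output. The sketch does not address the cross-version concern raised in the review.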
PR for #822